OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

CLASSIFICATION OF MACHINE-PRINTED AND HANDWRITTEN TEXTS USING CHARACTER BLOCK LAYOUT VARIANCE

Internal identifier: 002145 (Main/Exploration); previous: 002144; next: 002146

Authors: Kuo-Chin Fan [People's Republic of China]; Liang-Shen Wang [People's Republic of China]; Yin-Tien Tu [People's Republic of China]

Source: Pattern Recognition, vol. 31, no. 9, pp. 1275-1284

RBID : ISTEX:B565160A87E511684422B53594B5BC62A067B109

Abstract

Machine-printed and handwritten texts often appear intermixed in several kinds of documents, such as forms. Classifying machine-printed and handwritten text is thus a prerequisite for the subsequent optical character recognition task. In this paper, we present a classification method that automatically identifies whether a text segmented from a document image is machine-printed or handwritten. In our approach, the orientation of a text block is first classified as horizontal or vertical by analyzing the widths of the valleys in the X and Y projection profiles of the text block image. Then, a reduced X–Y cut algorithm is used to obtain the base blocks from the text block image. Finally, a spatial feature, the character block layout variance, is used to perform the classification. Our method can be applied to both English and Chinese document images. Experimental results demonstrate the feasibility of the proposed method for classifying handwritten and machine-printed texts.
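
The pipeline named in the abstract (projection profiles, valley widths, block cutting, a variance feature) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define the reduced X–Y cut or the character block layout variance. The function names, the single cut along X, and the use of block-height variance as a stand-in feature are all assumptions made for illustration.

```python
from statistics import pvariance

def projection_profiles(image):
    """Return (x_profile, y_profile) for a binary image given as rows of 0/1."""
    x_profile = [sum(col) for col in zip(*image)]  # ink count per column
    y_profile = [sum(row) for row in image]        # ink count per row
    return x_profile, y_profile

def valley_widths(profile):
    """Widths of the runs of empty bins that lie between two inked bins."""
    widths, run, seen_ink = [], 0, False
    for value in profile:
        if value == 0:
            run += 1
        else:
            if seen_ink and run:
                widths.append(run)
            seen_ink, run = True, 0
    return widths

def split_columns(image):
    """Single cut along X (assumption): (start, end) column spans of inked runs."""
    x_profile, _ = projection_profiles(image)
    spans, start = [], None
    for i, value in enumerate(x_profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(x_profile)))
    return spans

def block_heights(image, spans):
    """Height of the inked extent inside each column span."""
    heights = []
    for left, right in spans:
        rows = [r for r, row in enumerate(image) if any(row[left:right])]
        heights.append(rows[-1] - rows[0] + 1)
    return heights

def layout_variance(image):
    """Assumed stand-in feature: population variance of the block heights."""
    return pvariance(block_heights(image, split_columns(image)))

# Toy text blocks (hypothetical data): regular "printed" vs. irregular "handwritten".
PRINTED = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 0, 1, 1],
]
HANDWRITTEN = [
    [1, 0, 0, 0, 0, 0, 0, 1, 1],
    [1, 0, 1, 1, 0, 0, 0, 1, 1],
    [1, 0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    for name, img in (("printed", PRINTED), ("handwritten", HANDWRITTEN)):
        x_profile, _ = projection_profiles(img)
        print(name, valley_widths(x_profile), layout_variance(img))
```

On these toy inputs the regular block gives equal valley widths and zero height variance, while the irregular block gives unequal valleys and a positive variance. A classifier built on such a feature would still need a threshold or a trained decision rule, which the abstract does not specify.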

Url:
DOI: 10.1016/S0031-3203(97)00143-X


Affiliations:




The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title>CLASSIFICATION OF MACHINE-PRINTED AND HANDWRITTEN TEXTS USING CHARACTER BLOCK LAYOUT VARIANCE</title>
<author>
<name sortKey="Fan, Kuo Chin" sort="Fan, Kuo Chin" uniqKey="Fan K" first="Kuo-Chin" last="Fan">Kuo-Chin Fan</name>
</author>
<author>
<name sortKey="Wang, Liang Shen" sort="Wang, Liang Shen" uniqKey="Wang L" first="Liang-Shen" last="Wang">Liang-Shen Wang</name>
</author>
<author>
<name sortKey="Tu, Yin Tien" sort="Tu, Yin Tien" uniqKey="Tu Y" first="Yin-Tien" last="Tu">Yin-Tien Tu</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:B565160A87E511684422B53594B5BC62A067B109</idno>
<date when="1998" year="1998">1998</date>
<idno type="doi">10.1016/S0031-3203(97)00143-X</idno>
<idno type="url">https://api.istex.fr/document/B565160A87E511684422B53594B5BC62A067B109/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000880</idno>
<idno type="wicri:Area/Istex/Curation">000870</idno>
<idno type="wicri:Area/Istex/Checkpoint">001649</idno>
<idno type="wicri:doubleKey">0031-3203:1998:Fan K:classification:of:machine</idno>
<idno type="wicri:Area/Main/Merge">002262</idno>
<idno type="wicri:Area/Main/Curation">002145</idno>
<idno type="wicri:Area/Main/Exploration">002145</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a">CLASSIFICATION OF MACHINE-PRINTED AND HANDWRITTEN TEXTS USING CHARACTER BLOCK LAYOUT VARIANCE</title>
<author>
<name sortKey="Fan, Kuo Chin" sort="Fan, Kuo Chin" uniqKey="Fan K" first="Kuo-Chin" last="Fan">Kuo-Chin Fan</name>
<affiliation wicri:level="1">
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Institute of Computer Science and Information Engineering, National Central University, Chung-Li, Taiwan</wicri:regionArea>
<wicri:noRegion>Taiwan</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Wang, Liang Shen" sort="Wang, Liang Shen" uniqKey="Wang L" first="Liang-Shen" last="Wang">Liang-Shen Wang</name>
<affiliation wicri:level="1">
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Institute of Computer Science and Information Engineering, National Central University, Chung-Li, Taiwan</wicri:regionArea>
<wicri:noRegion>Taiwan</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Tu, Yin Tien" sort="Tu, Yin Tien" uniqKey="Tu Y" first="Yin-Tien" last="Tu">Yin-Tien Tu</name>
<affiliation wicri:level="1">
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Institute of Computer Science and Information Engineering, National Central University, Chung-Li, Taiwan</wicri:regionArea>
<wicri:noRegion>Taiwan</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Pattern Recognition</title>
<title level="j" type="abbrev">PR</title>
<idno type="ISSN">0031-3203</idno>
<imprint>
<publisher>ELSEVIER</publisher>
<date type="published" when="1997">1997</date>
<biblScope unit="volume">31</biblScope>
<biblScope unit="issue">9</biblScope>
<biblScope unit="page" from="1275">1275</biblScope>
<biblScope unit="page" to="1284">1284</biblScope>
</imprint>
<idno type="ISSN">0031-3203</idno>
</series>
<idno type="istex">B565160A87E511684422B53594B5BC62A067B109</idno>
<idno type="DOI">10.1016/S0031-3203(97)00143-X</idno>
<idno type="PII">S0031-3203(97)00143-X</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0031-3203</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Machine-printed and handwritten texts often appear intermixed in several kinds of documents, such as forms. Classifying machine-printed and handwritten text is thus a prerequisite for the subsequent optical character recognition task. In this paper, we present a classification method that automatically identifies whether a text segmented from a document image is machine-printed or handwritten. In our approach, the orientation of a text block is first classified as horizontal or vertical by analyzing the widths of the valleys in the X and Y projection profiles of the text block image. Then, a reduced X–Y cut algorithm is used to obtain the base blocks from the text block image. Finally, a spatial feature, the character block layout variance, is used to perform the classification. Our method can be applied to both English and Chinese document images. Experimental results demonstrate the feasibility of the proposed method for classifying handwritten and machine-printed texts.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>République populaire de Chine</li>
</country>
</list>
<tree>
<country name="République populaire de Chine">
<noRegion>
<name sortKey="Fan, Kuo Chin" sort="Fan, Kuo Chin" uniqKey="Fan K" first="Kuo-Chin" last="Fan">Kuo-Chin Fan</name>
</noRegion>
<name sortKey="Tu, Yin Tien" sort="Tu, Yin Tien" uniqKey="Tu Y" first="Yin-Tien" last="Tu">Yin-Tien Tu</name>
<name sortKey="Wang, Liang Shen" sort="Wang, Liang Shen" uniqKey="Wang L" first="Liang-Shen" last="Wang">Liang-Shen Wang</name>
</country>
</tree>
</affiliations>
</record>
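
As an illustration of how the bibliographic fields in the record above might be read programmatically, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It works on a trimmed copy of the `<biblStruct>` element rather than the full record: the `wicri:` prefix has no namespace declaration in this dump, so a namespace-aware parser would be expected to reject the record as printed. The `FRAGMENT` constant and the `summarize` helper are hypothetical names introduced for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <biblStruct> element from the record (wicri: parts omitted).
FRAGMENT = """<biblStruct>
<analytic>
<title level="a">CLASSIFICATION OF MACHINE-PRINTED AND HANDWRITTEN TEXTS USING CHARACTER BLOCK LAYOUT VARIANCE</title>
<author><name first="Kuo-Chin" last="Fan">Kuo-Chin Fan</name></author>
<author><name first="Liang-Shen" last="Wang">Liang-Shen Wang</name></author>
<author><name first="Yin-Tien" last="Tu">Yin-Tien Tu</name></author>
</analytic>
<series>
<title level="j">Pattern Recognition</title>
<biblScope unit="volume">31</biblScope>
<biblScope unit="issue">9</biblScope>
<biblScope unit="page" from="1275">1275</biblScope>
<biblScope unit="page" to="1284">1284</biblScope>
</series>
<idno type="DOI">10.1016/S0031-3203(97)00143-X</idno>
</biblStruct>"""

def summarize(xml_text):
    """Collect the main bibliographic fields into a plain dictionary."""
    root = ET.fromstring(xml_text)
    # Both page bounds are stored as separate <biblScope unit="page"> elements.
    pages = [scope.text for scope in root.findall("series/biblScope[@unit='page']")]
    return {
        "title": root.findtext("analytic/title"),
        "authors": [name.text for name in root.findall("analytic/author/name")],
        "journal": root.findtext("series/title[@level='j']"),
        "volume": root.findtext("series/biblScope[@unit='volume']"),
        "issue": root.findtext("series/biblScope[@unit='issue']"),
        "pages": "-".join(pages),
        "doi": root.findtext("idno[@type='DOI']"),
    }

if __name__ == "__main__":
    for key, value in summarize(FRAGMENT).items():
        print(f"{key}: {value}")
```

To process the full record the same way, one option is to add a declaration for the `wicri` prefix to the root element before parsing. The namespace URI to use is not given on this page.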

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002145 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002145 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:B565160A87E511684422B53594B5BC62A067B109
   |texte=   CLASSIFICATION OF MACHINE-PRINTED AND HANDWRITTEN TEXTS USING CHARACTER BLOCK LAYOUT VARIANCE
}}


This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024